The use of virtual reality in studying prejudice and its reduction: A systematic review

This systematic review provides an up-to-date analysis of existing literature about Virtual Reality (VR) and prejudice. We aim to answer three questions: how has VR been used in studying intergroup attitudes, bias and prejudice; are VR interventions effective at reducing prejudice; and what methodological advantages and limitations does VR provide compared to traditional methods? The included studies had to use VR to create an interaction with one or more avatars belonging to an outgroup, and/or embodiment in an outgroup member; furthermore, they had to be quantitative and peer-reviewed. The review of the 64 included studies shows the potential of VR contact to improve intergroup relations. Nevertheless, the results suggest that under certain circumstances VR contact can increase prejudice as well. We discuss these results in relation to the intergroup perspective (i.e., minority or majority) and target minority groups used in the studies. An analysis of potential mediators and moderators is also carried out. We then identify and address the most pressing theoretical and methodological issues concerning VR as a method to reduce prejudice.


Theoretical background
Virtual reality (VR) has been used for a variety of purposes, from video games to army training [1] or medical education [2]. In the last decade, it has received increasing attention, particularly in social psychology, for a range of other applications such as prejudice reduction (see [3] for a meta-analysis). The great advantage of VR over regular computer-based simulation is the high degree of immersion: it creates a strong illusion of being in the computer-generated world [4]. As VR technology puts participants into a virtual world that can be controlled by the researcher, it allows both further development of known research paradigms with highly standardised experimental conditions and the creation of entirely new paradigms that were not possible before. In this review, we systematically evaluate studies which have utilised VR to study and combat prejudice.

Prejudice, intergroup bias, and intergroup attitudes
Prejudice generally refers to "any attitude, emotion, or behaviour toward members of a group, which directly or indirectly implies some negativity or antipathy toward that group" [5].

Malleability of intergroup attitudes
Prejudice reduction interventions can be divided in several ways based on their theoretical and methodological approaches. Paluck and Green [17] divide social scientific and psychological interventions into those that attempt to influence intergroup cognitions, emotions and behaviours (e.g., social categorisation and identities, intergroup contact and emotions) and those that attempt to support positive intergroup relations via educative (knowledge, critical thinking) or normative influence upon individual emotions or thoughts. The effectiveness of prejudice-reduction interventions seems to depend heavily on the type of social stigma, the outcome measures and the target groups studied. To study the malleability of implicit racial bias, Lai et al. [18] compared 17 different interventions in an experimental design. Their results showed that roughly half (8/17) of the interventions were effective at reducing non-Blacks' implicit preferences for Whites compared to Blacks. In terms of the mechanisms, the most effective interventions used counter-stereotypes and evaluative conditioning methods or provided cognitive and behavioural strategies to override biases, while those inducing perspective-taking (including imagined contact), egalitarian value orientation, or positive emotion turned out to be ineffective. In their study, no intervention consistently reduced explicit racial preferences, and there was no sign of an extended effect of the interventions towards "uncontacted" outgroups. Moreover, the effects of the interventions working with implicit bias also appear to be very short-lived. Lai et al. [19] further examined the stability of the effect of nine interventions and found that although they all immediately reduced implicit preferences, none of them was effective after a delay of several hours to several days.
Beelmann and Heinemann [20] have, in turn, conducted a meta-analysis of 81 studies containing 122 intervention-control comparisons of structured programs to reduce explicit negative intergroup attitudes in children and adolescents via intergroup contact, information/knowledge acquisition, and promotion of individual social-cognitive competencies. Their results showed that interventions that were based on direct contact experiences and induced empathy and perspective taking showed the strongest effects. The effects also varied according to the program participants' social status, the target outgroup, and the outcome measure. Interventions were less effective with emotional and behavioural measures of prejudice than when attempting to change the cognitive components of prejudice; prejudice towards disabled and elderly people was more malleable than towards ethnic minority group members, and attitudes of majority group members towards ethnic minority group members were more prone to the effect of intervention than vice versa (see also [21]). Notably, the majority of studies assessed by Beelmann and Heinemann [20] examined only immediate effects on intergroup attitudes with a mean effect size of around d = 0.30. Only ten out of 81 studies evaluated long-term effects and showed stable positive effects or even stronger positive effects over time (e.g., [22]). In this systematic review, we focus on the malleability of explicit and implicit intergroup attitudes via intergroup contact in VR. The volume of intervention studies based on contact theory and the empirical support for the effectiveness of contact-based interventions in prejudice reduction point to the potential of this method for prejudice reduction in VR.

Attitude change and malleability via intergroup contact
According to Allport's [23] contact hypothesis, positively engaging with outgroups is a fruitful way to improve intergroup relations. In this paradigm, intergroup contact was originally defined as actual face-to-face interaction between members of clearly defined groups [24]. Hundreds of studies have demonstrated that intergroup attitudes can be changed and improved by creating positive contact between different groups (see meta-analyses by [3,24,25]). These attitude changes are mainly mediated by increased knowledge about the outgroup, reduced anxiety about intergroup contact, and increased empathy and perspective taking. Importantly, when Allport's [23] conditions for positive intergroup contact (equal status between groups in the situation; common goals; cooperation between groups; support of authorities) are met, prejudice reduction is strongest [24]. Conversely, interventions can also backfire and lead to increased intergroup conflict if optimal conditions are not met [26].
Moreover, as a considerable corpus of recent research [27] has demonstrated, the positive effect of contact on intergroup attitudes is not limited to direct interactions but also emerges with mediated, extended or indirect contact [24,28,29], including online contact in general [30] and e-contact in particular [31], as well as imagined contact [32].
However, the potential of VR contact to reduce prejudice has not been properly evaluated. Amichai-Hamburger and McKenna [33], as well as Dovidio et al. [27], have stressed that online contact, including VR, might be particularly well suited for creating optimal contact conditions, because it creates an anxiety-safe and controlled environment. According to the meta-analysis by Lemmer and Wagner [3], which compared the effects of different direct and indirect forms of contact and included only eight comparisons with virtual contact, there is tentative, weak evidence that virtual contact intervention programs are useful in prejudice reduction. However, the constant development of VR technologies and the growing number of studies utilising VR in prejudice research continue to improve our understanding of VR as a platform to study and improve intergroup relations, and they also call for an evaluation of the progress made. In addition, there might be several moderators and mediators of the effect of VR contact on intergroup attitudes that are specific to VR contact. Reaching a better understanding of the features of VR contact can thus help to make it an important avenue for prejudice reduction endeavours.

Virtual reality for prejudice reduction
As previous studies on intergroup contact in VR differ a lot in their approaches and technological solutions, it is important to clarify what we refer to by VR studies and VR intergroup contact. According to Burdea and Coiffet [34], virtual reality is an immersive technology allowing the user to interact in real time with a 3D computer-generated environment simulating reality. One of its defining features is immersion in the environment, which is defined as the sensation of being there [35]. In VR, immersion is allowed by fully experiencing the simulated world through the senses of sight and sound, while the surrounding environment is not visible to the user [36]. This is closely related to the concept of embodiment, which we define by having full control over a virtual avatar, with the avatar's movements being coupled to the movements of one's physical body. This creates the illusion of ownership of the virtual body or perceiving the virtual body representation to be one's own body (see [37]). This sense of embodiment can cause the "Proteus effect", namely a change in people's behaviour and self-representation to match the identity of their virtual self [38,39]. Along with the high degree of immersion, the body ownership illusion allowed by embodiment makes VR users experience a stronger sense of spatial presence compared to the same environment in 2D [40].
How do VR's features serve research on prejudice reduction? Firstly, VR is a unique platform that can be used by researchers both to create and to study intergroup contact, as it enables the experience of direct and indirect intergroup encounters from both majority and minority perspectives. Importantly, VR combines a strong sense of a real social encounter with a high degree of experimental control, allowing researchers to ensure optimal conditions of intergroup contact [23]. Specifically, a recent meta-analysis on computer-mediated contact interventions indicated that online contact is typically characterised by a more equal status between groups compared to real-life contact [30], a crucial condition for prejudice reduction. Given that VR is, on the one hand, computer-mediated but, on the other hand, more realistic than other online encounters, it is important to determine whether VR contact functions the same way as or even better than real-world interventions and, if so, whether the positive effects of VR contact transfer to real-world encounters.
Another aspect that makes VR useful for the study of intergroup processes is that it allows constructing intergroup contact as experienced from both the minority and majority group perspective. VR research achieves this by using embodiment in two ways: either to enable the subject to embody an avatar belonging to the ingroup (usually majority group, as minority respondents have barely been studied), or to embody them in an avatar belonging to the outgroup (most often a stigmatised minority group for the same reason). VR can thus take perspective-taking interventions one step further by allowing the embodiment of avatars representing outgroup members, coming closer to literally taking their perspective.
A widely used alternative to "true" VR is simulating a first-person experience from the point of view of either the ingroup (i.e., majority) or the outgroup (i.e., minority group), without employing an embodied avatar. This kind of virtual experience has a similarly high degree of immersion and realism, but reduced feelings of body ownership. While both perspectives can be used to simulate intergroup contact in VR, embodiment in a minority avatar can elicit attitude change even without any additional virtual intergroup contact, by allowing the participant to "put themselves in the shoes of an outgroup member", to the extent it is possible to "live" the experience of an outgroup member.
However, given that empathy and perspective-taking have also been shown to have ironic effects in intergroup contexts, leading to more helping behaviour and paternalistic attitudes at the expense of willingness to combat prejudice and inequality [41], the question remains which form of VR contact (intergroup interaction or embodiment of an outgroup member) is more useful in prejudice reduction.
Finally, VR allows studying participants' behaviour in situations that, for ethical and/or practical reasons, could not be studied in real-life settings, such as helping behaviour in emergency situations or intergroup contact with the most vulnerable or isolated populations. Given all of the above reasons for VR's potential, it is thus important to evaluate whether VR is a powerful tool to combat group-specific biases.

Structure of the systematic review
As the number of studies capitalising on the potential of VR to improve intergroup attitudes increases, the need arises for an overview of research about prejudice and means to combat it in VR. In this systematic review, we provide an exhaustive analysis of existing research describing how virtual reality has been used to date to study and shape intergroup attitudes through virtual intergroup contact.
Furthermore, we discuss the methodological advantages VR introduces compared to traditional methods and naturalistic interventions, and we seek to provide a critical analysis of the challenges and limitations faced by scholars studying intergroup relations in VR.
As already noted above, most research on prejudice and intergroup contact in general and in VR in particular has so far focused on the majority's attitudes towards members of socially stigmatised groups such as ethnic minorities. In addition, attitudes based on age, disabilities and gender have been widely addressed. However, intergroup relations with different minority groups follow different dynamics as power relations in a society are hierarchically organised so that the prejudice towards some social groups is more normative than towards some others [42]. For example, a meta-analysis of 27 studies with 31 treatment arms by Paluck et al. [28] shows that contact-based interventions directed at ethnic or racial prejudice have generated substantially weaker effects than those targeted towards other social prejudices. For said reasons, in order to assess the potential of VR to reduce prejudice towards different stigmatised groups, we follow the classification of prejudiced or stigmatised groups adopted by Christofi and Michael-Grigoriou [43], which in turn is based on Goffman's [44] categorization of stigma as an individual attribute that causes society to reject those who are affected by it. Thus, we classify studies included in this review based on the type of stigma by which the target outgroup is affected: overt or external deformations (i.e. physical or age-related stigma), deviations in personal traits (i.e. stigmatising behaviours, health status or disorders), and tribal stigmas (i.e. deriving from ethnic or socio-cultural background). Due to the intersectional nature of some stigmatising characteristics (e.g. gender) that lead them to fall into more than one category, we introduce intersectional stigma as a further stand-alone category.
This systematic review is structured as follows: we first describe the method used to review and include eligible studies; then we analyse the results based on the intergroup perspective adopted in the studies, and successively according to the target stigmatised group; we then proceed to review the mediators and moderators that have been investigated to understand the effect of VR contact on prejudice. Next, we provide an overview of the limitations and advantages of VR for prejudice reduction. We lastly discuss and summarise the findings and address future research.
Subsequently, we lay out the following research questions:

Method
Before beginning our literature search, we pre-registered this review in the PROSPERO database (https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=222294). We report all deviations from the pre-registered plan in this review article.

Information sources and search strategy
Based on the field of interest (i.e., intergroup contact and prejudice in VR), we developed 16 search terms related to virtual reality and intergroup relations (the final search strategy can be found at https://mfr.de-1.osf.io/render?url=https://osf.io/rp3wg/?direct%26mode=render%26action=download%26mode=render) and used these to search published studies in three electronic databases (PsycInfo, Scopus, and Web of Science). The search terms were the following: (Vr OR virtual reality OR immersive virtual environment OR simulation-based assessment OR virtual reality exposure therapy OR virtual OR augmented reality) AND (intergroup relations OR ingroup outgroup OR prejudice OR discriminat* OR bias OR stereotyp* OR stigma* OR intergroup attitude* OR outgroup attitude*). Searches were limited to human samples and articles published in English, German, Finnish, or Italian. The original search was conducted in March 2021 and updated and expanded during the revision process in January 2022. Once duplicates were excluded, 15,504 citations remained for screening. After title and full-text screening, we searched the reference sections of all included articles for further eligible studies, and contacted all authors of included studies for further articles and unpublished data. The search resulted in 64 studies (51 published journal articles, 11 conference papers, and two dissertations), of which 4 were provided by authors. For more detailed information about the search and screening process, see the PRISMA flow diagram reported in Fig 1.
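As an illustration of how the Boolean search string above is composed, the following minimal sketch joins the two term groups with OR and combines the groups with AND. The term lists are copied from the search string; the variable and function names are hypothetical, and the use of an asterisk as the truncation wildcard is an assumption, since wildcard and field syntax differ between PsycInfo, Scopus, and Web of Science.

```python
# Term lists copied from the search string reported in the text.
# Assumption: "*" is the truncation wildcard; each database may need its own syntax.
VR_TERMS = [
    "Vr", "virtual reality", "immersive virtual environment",
    "simulation-based assessment", "virtual reality exposure therapy",
    "virtual", "augmented reality",
]
INTERGROUP_TERMS = [
    "intergroup relations", "ingroup outgroup", "prejudice", "discriminat*",
    "bias", "stereotyp*", "stigma*", "intergroup attitude*",
    "outgroup attitude*",
]


def or_block(terms):
    """Join the terms of one concept group with OR and wrap them in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(first_group, second_group):
    """Require at least one term from each concept group (AND of two OR blocks)."""
    return or_block(first_group) + " AND " + or_block(second_group)


QUERY = build_query(VR_TERMS, INTERGROUP_TERMS)
print(len(VR_TERMS) + len(INTERGROUP_TERMS), "search terms")
print(QUERY)
```

Keeping the terms in two lists makes it easy to verify that the count matches the 16 reported search terms and to adapt the string to each database.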

Inclusion and exclusion criteria
After the initial search, we carefully examined the methods used in each study against the following criteria. Included studies had to use immersive virtual reality (IVR; see e.g. [4]), that is, participants had to wear a head-mounted display, be in a room that was "transformed" into a virtual environment with projectors, or use a device to induce augmented reality (we had not pre-registered augmented reality studies but decided to include them due to the high degree of similarity). We excluded qualitative studies and opinion pieces. Moreover, we did not include studies that presented the virtual content on a computer screen. In the virtual environment, participants had to either embody an avatar representing the outgroup or take the perspective of a social group they did not belong to, and/or get in contact with at least one member of an outgroup.
In case of intervention studies, control groups would either receive an intervention to reduce prejudice other than in virtual reality (e.g. real-life interaction, perspective taking exercise, using non-immersive technology), or they would embody or interact with virtual ingroup members (i.e. only intragroup, but no intergroup contact), or would not experience any kind of intergroup contact.
Studies had to report at least one measure of intergroup bias, such as different measures of implicit and explicit prejudice, stereotypes, physiological measures associated with prejudice, or behavioural measures such as physical proximity.

Data extraction
Two independent researchers (MT and MA) conducted all literature searches, screened titles, abstracts, and full-text articles. Selection was such that after each step, any title or abstract that was deemed relevant by either researcher was included in the next step. Both authors then agreed on the final set of included articles. In case of disagreement (k = 8 articles), a third author (IJ-L) made the final decision. The final set of included articles constituted 64 studies reported in 62 independent articles.
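The screening rule described above can be sketched as two set operations. This is a minimal illustration under our reading of the text: at each intermediate step a record moves forward if either screener flags it, and at the final step disagreements go to a third rater. The function names and the toy record identifiers are hypothetical.

```python
def next_stage(flagged_by_a, flagged_by_b):
    """Carry forward every record flagged as relevant by either screener (union)."""
    return set(flagged_by_a) | set(flagged_by_b)


def final_set(include_a, include_b, third_rater):
    """Keep records both screeners include; a third rater decides the disputed ones."""
    include_a, include_b = set(include_a), set(include_b)
    agreed = include_a & include_b
    disputed = include_a ^ include_b  # included by exactly one screener
    resolved = {record for record in disputed if third_rater(record)}
    return agreed | resolved, disputed


# Toy example with made-up record identifiers.
carried = next_stage({"r1", "r2", "r3"}, {"r2", "r4"})
included, disputed = final_set({"r1", "r2"}, {"r2", "r4"}, lambda r: r == "r4")
print(sorted(carried), sorted(included), sorted(disputed))
```

The union at the intermediate steps is deliberately inclusive, so that a record is only dropped early when both screeners consider it irrelevant.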
Extracted data included publication year and language, study design (within or between participants), country of research site, sample characteristics (size, average age, gender composition, ethnicity), the VR medium and apparatus (VR headset, augmented reality, virtual world projected into a room), the group that was the prejudice target, how outgroups were represented (3D video, virtual agents, avatars, embodying an outgroup avatar), how the contact was designed, the intergroup bias measure, examined hypothesised mediators (such as empathy, gratitude, inclusion of other in the self) and moderators (such as socio-economic status). The extracted data can be seen in Table 1.
We applied the Cochrane Collaboration's tool [45] to assess risk of bias for all included studies. The risk of bias in each study was judged as high, low, or unclear on each of the following domains: selection bias, performance bias (on experimenter and participant level), detection bias, attrition bias (on participant level as well as outcome level), and reporting bias. Both authors independently assessed the risk of bias for all studies, and disagreements were resolved through discussion between the coders. Appendix 1 provides an overview of the distribution of risks of bias (low, unclear, high) across studies.

Descriptive results: Studying prejudice in VR
We found altogether 64 eligible original studies. A list of the included studies can be found in Table 1. Of the 64 studies included in the review, 10 are observational, that is, their main aim is to assess intergroup attitudes in VR, with the remaining studies delivering interventions aimed at decreasing participants' prejudice. All of the 10 observational designs fall into the majority perspective classification, and either focus on tribal stigma (7 studies) or intersectional stigma (3 studies). Only four of the included studies encompass longitudinal measures. The majority (k = 38) of included studies use a between-subjects design, 16 studies a within-subjects design, eight studies opt for a mixed within-between subjects design, and finally two studies did not employ any control condition.
It is worth pointing out that 32 of the included studies, accounting for half of the total number, were published between 2020 and 2022, showing the growing interest and the rapid progress of research in the field. Of the remaining studies, 23 were published between 2015 and 2019, and 9 before 2014, the earliest publication being dated in 2007.
In terms of intergroup contact, in 28 of the included studies, participants are embodied in an avatar resembling an outgroup member, for example White participants being embodied in a Black avatar or male participants in a female avatar. 18 studies have participants interacting with a virtual agent, that is a virtual character steered by the computer, while only three studies use an avatar controlled by the experimenter. 13 studies present participants with 3D videos and in two studies, participants use augmented reality that enhances the perception of the real world with optical and auditory hallucinations as experienced by people living with schizophrenia.
The most commonly studied outcomes are explicit (k = 37) and implicit attitudes (k = 25) towards the target group with some form of the Implicit Association Test [11] being the most commonly used measure for the latter. Following the narrative of VR being an "empathy machine" [46,47], many studies include measures of empathy, sympathy, pity, self-other overlap, or willingness to help in the future (k = 19 studies using at least one such measure). Neurophysiological measures like heart rate, skin conductance, electroencephalography (EEG) or functional magnetic resonance imaging (fMRI) are applied much more rarely to examine physiological or neural activation patterns of prejudice during the experiments (k = 5). It is noteworthy that across studies, a wide range of different questionnaires, tasks, and measures is administered with seemingly no emerging standards in the field.
Lastly, 30 out of 64 studies check the success of the VR experience in terms of immersion, body ownership, or spatial presence, while the remaining ones do not measure any related variable.
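The mutually exclusive breakdowns reported in this section can be checked for internal consistency with a short tally. The sketch below only re-uses the counts stated in the text; the dictionary keys are shorthand labels chosen for illustration. Outcome measures (e.g. explicit and implicit attitudes) are left out because a single study can report several of them, so those counts are not expected to sum to 64.

```python
TOTAL_STUDIES = 64

# Counts as reported in the text; the labels are illustrative shorthand.
tallies = {
    "study design": {
        "between-subjects": 38, "within-subjects": 16,
        "mixed": 8, "no control condition": 2,
    },
    "publication period": {
        "2020-2022": 32, "2015-2019": 23, "earlier": 9,
    },
    "contact format": {
        "embodied outgroup avatar": 28, "virtual agent": 18,
        "experimenter-controlled avatar": 3, "3D video": 13,
        "augmented reality": 2,
    },
}


def check_totals(breakdowns, total):
    """Return, for each breakdown, whether its category counts add up to the total."""
    return {name: sum(counts.values()) == total
            for name, counts in breakdowns.items()}


results = check_totals(tallies, TOTAL_STUDIES)
recent_share = tallies["publication period"]["2020-2022"] / TOTAL_STUDIES
print(results)
print(f"Share published 2020-2022: {recent_share:.0%}")
```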

The effect of intergroup contact in VR on intergroup attitudes
The following paragraphs describe results from the included studies with a specific focus on the kind of contact experience created in VR, the kind of stigma that the outgroups represented, the outcome measures used to examine prejudice and intergroup bias, and the psychological mechanisms and moderating variables that have been studied. In our presentation of results, we place a special emphasis on key studies that we consider as positive examples in terms of rigorous methods and the results of which appear more trustworthy.

Types of contact.
Two major forms of contact emerge from the analysed 64 studies: from the ingroup perspective, which is to say that participants belonging to a majority group interact in VR with avatars or virtual agents representing a stigmatised outgroup (k = 28); and from the outgroup perspective, meaning that the participant belonging to the majority group lives the virtual experience from a minority outgroup member's perspective (k = 36). In both types of designs, the subject does not always steer a virtual body: in some cases (k = 13), participants experience contact from a non-embodied virtual perspective using 360° videos, for example in studies by Hasson et al. [48] and Lesur et al. [49]. Given that experiencing contact from the outgroup perspective is unique to VR, embodiment is by far the preferred method in the designs from a minority perspective (aside from a few exceptions, e.g. [49]), while those using a majority perspective more often rely on a disembodied point of view (e.g. [50,51]). For a more precise overview of the methods chosen by each study, see Table 1.
When it comes to studies aiming at assessing prejudice in VR, only 3 out of 16 exploit the point of view of a minority group member. All of them seem to suggest that embodying an outgroup (minority group) member leads to positive outcomes such as increased empathy [52,53] and less implicit [53,54] and explicit [53] prejudice. Of the remaining 13 using the majority group perspective, most show that real-world prejudice can also be demonstrated in VR. For example, a pioneering randomised controlled trial by Dotsch and Wigboldus [55] used behavioural (the distance participants kept from avatars) and physiological (skin conductance responses) measures to show that participants exhibited higher prejudice towards Moroccan virtual agents than towards White ones.
In studies that adopt the majority perspective approach to prejudice (k = 28), both implicit and explicit measures of intergroup attitudes have been used, providing somewhat inconsistent results. While some evidence [56,57] seems to suggest a decrease in prejudice towards minority outgroups following intergroup contact in VR, other studies [58–60] fail to obtain any significant change.
Among the intervention studies aiming at prejudice reduction through the majority perspective, two randomised controlled trials emerge [59,71]. While they both resort to explicit measures only, their findings differ. Kuuluvainen and colleagues [59] fail to find any improvement in intergroup anxiety after exposing White participants to virtual intergroup contact with a Middle Eastern man, compared to exposure to the same material in 2D. Peña et al. [71], in turn, have shown that participants who contacted a virtual outgroup member while embodying an avatar that resembled themselves reported increased social distance towards the contacted political outgroup.
The results of the studies that adopt a minority perspective as a strategy to reduce prejudice (k = 36) are similarly mixed. For example, a randomised controlled trial exclusively targeting police officers [72] analysed participants' behavioural responses after being embodied in a Black suspect who was abused by another police officer, and found greater helping behaviour up to one month after the VR experience. Banakou et al. [73] also show that experiencing the world from the minority outgroup's (i.e. Black people) perspective improves attitudes towards that group, and so do Peck et al. [74], Salmanowitz [75], Christofi et al. [76], Chen et al. [77,78], Chowdhury et al. [79], Tong et al. [53,80], and Zhang et al. [81]. However, other studies contest these findings by showing that intergroup contact experienced from the minority perspective does not necessarily have any effect on intergroup attitudes. Lastly, some scholars have found that impersonating an outgroup member may even worsen intergroup attitudes [82–86].
While more investigation is needed to understand the underlying mechanisms leading to increased prejudice in VR following the embodiment in an outgroup member, there is initial evidence suggesting that experiencing unpleasant circumstances in the skin of an outgroup member may lead to worsened attitudes rather than improved perspective taking towards the stigmatised outgroup. In the reviewed set of studies this was the case when participants experienced the point of view of people affected by Asperger syndrome [83], experienced schizophrenia symptoms simulated by augmented reality [84], or when White participants were embodied in a Black avatar [82]. In support of this hypothesis, the findings of Banakou et al. [82] suggest that when participants experience negative affect while embodying outgroup members, their implicit bias against that group increases. The negative affect condition in this methodologically sound study was implemented as virtual passers-by displaying negative facial expressions, staring right at the participant, and changing direction to avoid participants. Whereas Kishore et al. [72] reach opposite conclusions, it is worth pointing out that their trial had a significantly smaller sample size, which makes those findings less reliable. We will next examine potential target-specific effects of VR contact more thoroughly, and then move to mediators and moderators of the effect of VR contact on prejudice in section 4.2.4.

Types of stigma.
Of the 64 studies found eligible for inclusion, 31 focus on tribal stigma, with most studies focusing on contact with avatars representing outgroups of African ethnic background. 13 studies deal with deviations in personal traits (e.g. schizophrenia, HIV, substance abuse), and 8 with stigma deriving from overt or external deformations. Of the latter, 4 target elderly people, 2 people with physical disabilities, and 2 individuals with obesity. Finally, among the 12 studies targeting intersectional types of stigma, one tackles prejudice towards transgender people, and the remaining ones towards women.
Different correlational designs exploring intergroup bias [55,61,87] confirm persisting bias against people with African background in VR, including studies that show persistence of the "shooter bias" (i.e. that participants tend to shoot more often and faster at Black rather than White targets in ambiguous shooting situations, [88]) in VR [62,66,68].
However, there is also evidence that both intergroup contact [56,72] and embodiment in an outgroup avatar [74,77,78] in VR can successfully be used to decrease racial bias. Furthermore, evidence by Hasler et al. [56] and Hasson et al. [48] suggests that the effect of VR contact is not specific to interracial attitudes, but can also improve attitudes in other critical intergroup conflict situations: both studies showed that Jewish Israelis' attitudes towards Palestinians could be improved using VR. While Hasler et al. [56] achieved this through a discussion between participants and an outgroup avatar, Hasson and colleagues [48] obtained positive results using 3D videos to present the outgroup's perspective.
While positive VR interaction has shown its potential in reduction of racial prejudice, as already discussed above, there is some contrasting evidence of the effect of embodiment in a racial minority group member in VR on intergroup attitudes. Namely, there are results showing that embodying Black avatars can also lead to worsened implicit attitudes in White participants [89]. The previously mentioned study by Banakou et al. [82] shows that negative contact conditions when embodying a Black avatar and the associated affective reaction can be one explanatory factor for this effect. On the other hand, Kishore et al. [72] find that being embodied in a Black avatar targeted by discriminating behaviour leads to greater helping behaviour. Finally, two trials using either embodiment in an outgroup avatar [90] or 3D videos of an interaction with a Middle Eastern man [59] fail to find any improvement in intergroup attitudes compared to the control group.
When it comes to deviations in personal traits, Toppenberg et al. [51,69] show that implicit bias towards people living with HIV persists even in VR, and that evaluations were more positive when perceived responsibility for the condition was low. While Tong et al. [53,80] take it a step further, showing that being embodied in chronic pain patients improves self-reported attitudes and willingness to help, contrasting evidence is brought by designs using augmented reality to simulate schizophrenia symptoms. Interestingly, while de Silva et al. [91] show increased empathy towards schizophrenic patients following an augmented reality experience, Kalyanaraman et al. [84] suggest that such embodied experience may lead to a desire for keeping a greater distance from them. Stelzmann et al. [70] also find stronger stigmatisation of people with schizophrenia after facing an outgroup member in a 3D video. Hadjipanayi and Michel-Grigoriou [83] reach similar conclusions following embodiment in people with Asperger syndrome. Interestingly, Peña et al. [71] suggest that embodying an avatar that physically resembles the self leads to increased social distance towards a contacted political outgroup. Finally, Yuen et al. [92] fail to find any difference between VR embodiment in a subject with depressive symptoms and text-based perspective-taking.
As far as stigma due to overt or external deformations is concerned, Persky and Eccleston [67] show that obese virtual patients are subject to prejudiced treatment when dealing with health professionals. Chowdhury et al. [79] find a decrease in prejudice towards wheelchair users following virtual embodiment. Moreover, contrasting results are reported by Banakou et al. [93], who found that embodiment in an elderly individual with high IQ improves implicit attitudes toward elderly people, and Oh et al. [94], whose subjects did not show any improvement in attitudes after being embodied in an elderly woman.
Lastly, designs dealing with intersectional stigma highlight the same pattern of mixed evidence when it comes to interventions. Indeed, while some fail to find any positive effect of intergroup VR contact [49] and embodiment in an outgroup member [95], others show improved attitudes can be a result of both methods [57,81]. On the other hand, two studies [85,86] suggest that embodying male individuals in female avatars may also lead to the deterioration of implicit attitudes, even when the performed task is not supposed to elicit any negative affect (i.e. a Tai-Chi class). Observational studies on intersectional stigma confirmed the endurance of gender-based bias [63,64], and bias based on sexual orientation [96].
The results above reinforce the hypothesis that factors such as the degree of immersion and the valence (positive or pleasant vs negative or unpleasant) of the embodied experience may play a primary role in the success or failure to reduce prejudice following embodiment in an outgroup member.

Outcome measure.
Given the previously discussed and widely studied differences between explicit and implicit measures of prejudice, it is worth discussing the findings also on the basis of their outcome measures. We chose not to focus on results obtained through physiological and neurological measures, due to them being severely underrepresented in the included studies. Specifically, only 5 of the included studies include a neurophysiological measure of prejudice, as compared to 37 that assess explicit, and 25 implicit attitudes. Of those, 10 assess prejudice with both implicit and explicit measures.
Twelve out of 25 studies examining implicit intergroup attitudes are intervention studies and rely on the IAT to assess intergroup bias. Of those, Lopez et al. [85] and Schulze et al. [86] found that implicit attitudes further deteriorated after the intervention, while Banakou et al. [73,93], Peck et al. [74], Starr et al. [54], and Zhang et al. [81] highlight a clear improvement in implicit attitudes following exposure to VR contact. Notably, all intervention studies assessing implicit attitudes are based on embodiment of an outgroup member.
By contrast, sixteen studies enacting bias-reducing interventions exclusively used explicit measures. A considerable number of them found a decrease in prejudice following embodiment in an outgroup member [52,53,76,77,78,80,92]. Two studies by Peña et al. [71] and Stelzmann et al. [70] conversely found increased levels of prejudice after engaging in virtual intergroup contact with an outgroup member, and Hadjipanayi and Michel-Grigoriou [83] and Kalyanaraman et al. [84] obtain similar results through embodiment of an outgroup member.
Lastly, sixteen intervention studies include both implicit and explicit measures of prejudice, of which nine focus on a majority perspective. Whereas three of them report a decrease in intergroup bias assessed through implicit measures after embodiment of an outgroup member, but no significant change in explicit ones [75,82,97], Breves [40] only found a decrease in prejudice through explicit measures, but no effect on implicit ones. Finally, Groom et al. [89] show that embodying an outgroup member in a work interview leads to worse implicit attitudes but has no effect on explicit ones. No intervention study taking into consideration both implicit and explicit measures of prejudice has found converging results, but as already pointed out earlier, these two types of measurements are often discordant, most likely due to social desirability effects (for an overview of the discussion about the discordance between implicit and explicit measures, see e.g., [13,98,99,100,101]). In addition, most studies employing implicit measures used embodiment of an outgroup member as an intervention, which might be more likely to change implicit rather than explicit attitudes. In summary, it seems that implicit measures unveil potential effects of VR-based interventions that might not appear in explicit measures of intergroup attitudes.

Mediators and moderators.
It is widely established that intergroup contact reduces prejudice through both affective (i.e. empathy, intergroup anxiety) and cognitive mediators (i.e. perspective taking, increased familiarity, and knowledge; see [25] for a review). Fourteen studies encompass an analysis of potential mediating mechanisms explaining the effect of VR contact on prejudice. Among those, six are observational studies. Two of them suggest that physiological measures have great potential to elucidate mechanisms accounting for prejudice in VR. Specifically, one study [61] found an association between EEG-measured alertness and attitudes, while Dotsch and Wigboldus [55] observed that measures of skin conductance are correlated with implicit attitudes towards the target minority. On the other hand, regarding potential psychological mediators, Eiler [62] found no mediation of prejudiced behaviour by perceived threat, nor did Bielen et al. [87] find mediation by concern about terrorism when participants judged minority defendants in a court trial.
When it comes to prejudice-reducing interventions, evidence emerges that the positive effect of VR contact is due to emotional mediators, such as feeling more closeness to the prejudiced target [76] and perceiving them as warmer [58]. Empathy has also been found to be a mediator of VR contact when it comes to embodying an outgroup member [77]. Furthermore, Hasler et al. [102] interestingly show that feelings of presence in VR had a mediating effect on the negative affect toward the majority ingroup, when experiencing a conflict scenario from the outgroup's (minority) perspective. Lastly, Peña et al. [71] showed that inducing identity salience does not mediate changes in prejudice.
Thirteen studies include moderation analyses. Among the ones delivering interventions, few studies investigated individual differences as moderating variables. Christofi et al. [76] have found that differences in trait empathy moderate the improvement of attitudes towards the outgroup, with individuals higher in empathy showing less bias after VR contact than those low in empathy. Additionally, two studies investigated social identification as a moderating variable on the effects of embodying an outgroup avatar. Chen et al. [77] show that participants generally placing greater importance on their various group memberships show stronger intervention effects, namely greater increase in self-other overlap with the embodied ethnic outgroup. Starr et al. [54] suggest that higher identifiers with the embodied avatar experience greater decrease in intergroup bias.
When it comes to moderators linked to specific features of the VR experience, Chowdhury et al. [79] interestingly found that a disabled narrator led to greater decrease in prejudice against disabled people, when embodying a wheelchair user. In addition, Banakou et al. [82] show that the valence of intergroup contact while embodying an outgroup member moderates the change in attitudes towards the embodied minority (with more positive contact resulting in a more positive change), while the number of exposures to the same kind of embodiment-based intervention does not [73]. Finally, Peña et al. [71] found that participants customising their own avatar to look like themselves eventually expressed desire for greater social distance from the contacted outgroup following the interaction.
The observational studies using the shooter bias paradigm revealed no effect of distance and armed status on difference of shooting behaviour towards majority or minority members [62], but a moderation effect of socioeconomic status (SES) on shooting accuracy [103], with subjects making fewer mistakes when facing high-SES targets.

Risk of bias assessment
The risk of bias assessment (see Table 2 for the detailed risk of bias assessment and S1 Table in S1 Checklist for the overview) showed that a large proportion of studies did not report specifically how participants were assigned to conditions. Relatedly, it was also often unclear to what degree experimenters and participants were aware of which condition participants were assigned to, inducing risk for performance biases in both participants and experimenters. Judging from the written reports, risk for performance bias in participants was deemed to be high or unclear in over 80% of studies. This was mostly because it was not clearly reported whether participants could have guessed the purpose of the study and/or which condition they were assigned to.
Table 2. Risk of bias assessment for all studies following the Cochrane Collaboration's risk of bias tool (Higgins et al., 2019). As detection bias is assessed for each outcome separately, we classified the different outcomes when more than one was reported in a study and rated the risk of bias for each class of outcomes (e.g. IAT; self-reports; physiological recordings) as low, unclear, or high.
The overall low risk of reporting and attrition bias is worth positive mention: most studies reported null results for at least some of the assessed variables. However, without pre-registration, it is impossible to assess whether further variables were assessed but not reported.

Discussion
First and foremost, this systematic review shows that VR is not a social vacuum but a virtual environment enabling co-creation and modification of social reality. The review thus clearly indicates that features of prejudice in situated social environments persist also in VR, while also showing how VR is turning into a valuable resource for studying intergroup attitudes and their change through intergroup contact. The distinguishing features of immersiveness, body ownership and embodiment provide VR with a considerable potential for stimulating perspective taking, which has been shown to be an important mediator in prejudice reduction [25], and simulating highly realistic social environments.

Overview and future research directions
The existing literature has used either a majority perspective to intergroup contact (i.e. embodiment in an ingroup member), or minority perspective (i.e. embodiment in an outgroup member). The latter option fully exploits the distinguishing features of VR, as it allows a highly realistic experience from the perspective of a stigmatised minority member. Existing evidence is nevertheless contrasting: while studies employing the majority perspective have shown solid potential to decrease prejudice towards stigmatised minority groups, studies using the minority outgroup perspective show that embodying an outgroup member can either lead to reduction of or increase in prejudice. Studies using both explicit and implicit measures of intergroup attitudes seem to indicate that implicitly assessed biases are more likely to change from embodying an outgroup member. This might relate to the rather visceral experience of "being an outgroup member" which might serve to associate positive self-evaluation with that outgroup.
While there has so far been little attention to mediating and moderating mechanisms of prejudice reduction in VR, preliminary evidence [82,84,86] suggests that this could depend upon the affect elicited during the embodied experience, in line with earlier evidence that negative affect during intergroup encounters can increase implicit bias [104]. Experiencing negative affect in the body of a minority member could indeed lead to withdrawal behaviour and worsened attitudes toward said minority, underlining the importance of understanding the affective determinants of intergroup attitudes. Nevertheless, the results obtained by Kishore et al. [72] seem to contradict this pathway, since embodying a Black avatar that experiences discriminatory behaviour by a White police officer was found to increase participants' helping behaviour. One potential reason for this seemingly incompatible set of findings is the nature of the chosen outcome measures: while negative experiences associated with living a minority group perspective in VR might lead to defensive tendencies to distance oneself from the minority outgroup's reality and thus negatively influence implicit outgroup evaluations, the VR experience might positively affect behaviour through other routes of processing than implicit associations, such as through moral evaluations activating various aspects of empathy.
Intervention designers therefore face a dilemma: on the one hand, they want to give majority participants an experience that reflects that of a stigmatised minority member as accurately as possible to induce empathy and moral considerations about discrimination by providing an understanding of "what it is like to be that person"; on the other hand, if the experience elicits strong negative affect, this might lead to more negative attitudes (at least on an implicit level), which can lead to more discriminatory behaviour in the future [105]. To what degree prolonged and/or repeated exposure to embodiment interventions could also lead to explicit attitude change remains an open question but could be hypothesised from theoretical models of attitude change [106]. The fact that an intervention works differently on explicit and implicit attitudes is not unique to VR interventions (e.g. [107,108]) and relates to the general divergence of implicit and explicit attitudes and their relative contribution to behaviour, a much-debated issue in (social) psychology (see e.g. [13,98,99,100,101]). It also underlines the importance of selecting outcome measures that align with specific research questions: if, for example, the main aim of an intervention is to combat discriminatory behaviour, such behaviour should also be the main outcome measure. However, few of the examined studies have included actual behaviour as an outcome, again reflecting a larger issue in psychological science [109].
The stigmatised targets in the included studies represented a wide variety of minority groups, such as ethnic minorities, gender and sexual minorities, obese people, neurological patients, elderly people, drug users, and more. As with the intergroup perspective discussed above, the results suggest that, regardless of the target group, using VR to embody an outgroup member can both improve intergroup attitudes and deteriorate them. The latter effect also seems to be more prevalent in research designs using embodiment in an outgroup member. The fact that VR experiences sometimes lead to more negative attitudes towards outgroups poses a significant challenge for this research field, given that interventions should always follow a "first, do no harm" principle. Identifying specific factors that contribute to deterioration of outgroup attitudes must therefore be a major focus of future research. This is also relevant from an applied perspective: designers of VR games, for example, should be aware which game features might contribute to an increase in intergroup biases.
Those studies that explicitly examined moderating variables point to the importance of two participant-level factors that should be considered in a study design: on the one hand, participants differ on traits that make them more or less susceptible to effects of any prejudice-reduction intervention, such as empathy [76] or the importance they generally place on group memberships [78]. On the other hand, participants' involvement in and identification with the VR experience can contribute to more desirable effects [54]. These two factors might well be interconnected and these complexities should be considered in intervention designs. Relatedly, some emerging evidence suggests that the immersiveness of the experience may influence the effectiveness of interventions to reduce intergroup bias from the perspective of a stigmatised ethnic minority [102]. Taken together, at least part of the effect of immersiveness might thus be due to participants being better able to identify and get involved with the VR experience. In addition, the full experience of body ownership and identification with the embodied avatar seem to be critical, though understudied, moderators of the effect of VR contact on prejudice.
Despite the need to gather more insight into mechanisms specific to VR that could explain findings (e.g. immersion, body ownership, etc.), it is undeniable that this emerging method has great potential to study and reduce prejudice. Nevertheless, the evidence collected to date calls for further investigating the role of affect in influencing changes in attitudes when embodying an outgroup member. Indeed, while empathy and perspective-taking have established roles as emotional and cognitive mediators in prejudice reduction, there are also other affective and identity-related factors that have been seen to have a powerful influence on the contact effects on intergroup attitudes, such as intergroup anxiety, threats, morality, contact motivation and others (e.g. [25]), and that also need to be included in VR contact paradigms.

Methodological issues and advantages
Considering the novelty of VR as a method to investigate and act on prejudice, there is still great heterogeneity and discordance on the best practices to adopt. First and foremost, the majority of the included studies (n = 34) have not checked the success of the VR experience in terms of immersion, body ownership, or spatial presence. Given the centrality of such mechanisms in ensuring the illusion of being there [35] and perception of the virtual body as the subject's own [37], the absence of such experimental checks is a considerable limitation.
The variability of methods and lack of clarity in the operationalization of embodiment is also an emerging issue, as a significant number of studies do not provide the participants with a virtual body, but limit the VR experience to a "first person point of view", regardless of the degree of interactivity allowed in the design. Whereas this is usually the case with studies using 360° videos, it sometimes occurs in fully computer-designed environments as well (e.g. [81,96]). Given that not owning an avatar undermines feelings of body ownership by definition, it is a particularly important issue, especially in case of interventions based on embodying an outgroup member.
Rather strikingly, just one of the included studies [90] used technology to create VR-mediated contact between avatars steered by real members of different groups. Instead, the studies reviewed here have largely focused on interactions with computer-controlled virtual agents or avatars controlled by an experimenter. VR would seem like an ideal extension of computer-mediated or e-contact [31] that would not be limited to e.g. text-based contact but would be much closer to actual, real-world contact between majority and minority group members. It is quite possible that intergroup contact in VR equalises status between groups, as seems to be the case in computer-mediated contact [30]. Few studies, however, have measured or even taken into consideration Allport's positive contact conditions [23], namely equal status, shared goals, intergroup cooperation, and support by institutions or authorities. Without said conditions, interventions aimed at reducing prejudice through intergroup contact in VR may fail to enable positive contact, diminishing the potential to obtain positive contact effects or even laying the foundations for negative contact effects to occur. Future studies should therefore explore VR's potential to create optimal conditions for intergroup contact to reduce prejudice [24].
Moreover, as previously observed by [43], there is a general underuse of physiological and neurological measures on the one hand, and behavioural measures on the other hand, in favour of self-reported ones. To provide a more robust test of their interventions and to overcome limitations related to social desirability of explicit measures of prejudice, many studies have indeed complemented explicit measures with the measures of implicit attitudes such as the IAT, which we discussed in the section above.
As a final remark, the amount of detail reported is generally insufficient when it comes to experimental procedures, such that it was hardly possible to evaluate the degree of bias (see Appendix 1). Further, pre-registration was rare, making an assessment of possible outcome omissions impossible. On a positive note, recent studies tended to employ more sophisticated methods than earlier examples, indicating that the field is moving from initial proof-of-concept and pilot studies to more rigorous, systematic evaluations of interventions aimed at reducing prejudice.